Automatic transcription of continuous speech using unsupervised and incremental training

Authors

  • L. Sarada Ghadiyaram
  • Hemalatha Nagarajan
  • T. Nagarajan
  • Hema A. Murthy
Abstract

In [1], a novel approach was proposed for automatically segmenting and transcribing a continuous speech signal without the use of manually annotated speech corpora. In this approach, the continuous speech signal is first automatically segmented into syllable-like units, and similar syllable segments are grouped together using an unsupervised and incremental clustering technique. Separate models are generated for each cluster of syllable segments, and labels are assigned to them. These syllable models are then used for recognition/transcription. Even though the results in [1] are quite promising, the clustering technique has some problems due to (i) the presence of silence segments at the beginning and end of syllable boundaries, (ii) fragmentation of syllables, (iii) merging of syllables, and (iv) poor initialization of syllable models. In this paper we specifically address these issues and make several refinements to the baseline system, which result in a significant performance improvement of 8% over the baseline system described in [1].
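
The grouping step mentioned in the abstract can be illustrated with a very small sketch. It is not the authors' method: the abstract only says that similar syllable segments are grouped incrementally and that a separate model is built per cluster. The sketch below assumes, purely for illustration, that every segment has already been reduced to a fixed-length feature vector, that Euclidean distance measures similarity, and that a running mean stands in for the per-cluster model. The names `incremental_cluster` and `threshold` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def incremental_cluster(segments, threshold):
    """Leader-style incremental clustering (illustrative assumption only).

    Each segment is compared with the current cluster centroids. It joins
    the nearest cluster if the distance is within `threshold`; otherwise it
    starts a new cluster. Returns (labels, centroids).
    """
    centroids = []  # running mean of each cluster, a stand-in for a model
    counts = []     # number of segments assigned to each cluster
    labels = []     # cluster index assigned to each input segment

    for vec in segments:
        best, best_dist = None, float("inf")
        for i, centroid in enumerate(centroids):
            d = euclidean(vec, centroid)
            if d < best_dist:
                best, best_dist = i, d

        if best is None or best_dist > threshold:
            # No sufficiently similar cluster yet: start a new one.
            centroids.append(list(vec))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Update the running mean of the chosen cluster.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (x - c) / n
                               for c, x in zip(centroids[best], vec)]
            labels.append(best)

    return labels, centroids
```

The single `threshold` parameter shows why issues such as fragmentation and merging arise in this kind of scheme: a value that is too small splits one syllable class across several clusters, while a value that is too large merges distinct classes.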

Similar resources

Unsupervised training and directed manual transcription for LVCSR

A significant cost in obtaining acoustic training data is the generation of accurate transcriptions. When no transcription is available, unsupervised training techniques must be used. Furthermore, the use of discriminative training has become a standard feature of state-of-the-art large vocabulary continuous speech recognition (LVCSR) systems. In unsupervised training, unlabelled data are recogni...

Towards automatic learning in LVCSR: rapid development of a Persian broadcast transcription system

We present a new method for automatic learning and refining of pronunciations for large vocabulary continuous speech recognition which starts from a small amount of transcribed data and uses automatic transcription techniques for additional untranscribed speech data. The recognition performance of speech recognition systems usually depends on the available amount and quality of the transcribed ...

Unsupervised Acoustic Model Training for Simultaneous Lecture Translation in Incremental and Batch Mode

In this work the theoretical concepts of unsupervised acoustic model training and the application and evaluation of unsupervised training schemes are described. Experiments aiming at speaker adaptation via unsupervised training are conducted on the KIT lecture translator system. Evaluation takes place with respect to training efficiency and overall system performance in dependency of the availabl...

Automatic Speech Transcription and Archiving System using the Corpus of Spontaneous Japanese

The target of automatic speech recognition (ASR) research has shifted from read speech to spontaneous speech. The technology will realize automatic transcription (and translation) of lectures and meetings. In Japan, the "Spontaneous Speech" project has been conducted over the last five years, and we set up the huge "Corpus of Spontaneous Japanese (CSJ)", which consists of over 2000 speeches (500 hou...

Optimizing Data Selection for Automatic Speech Recognition in Low Resource Languages

Developing Automatic Speech Recognition (ASR) systems for low resource languages is a labor-, computation-, and time-intensive task. Data selection techniques seek highly informative subsets of speech data for transcription and can lead to considerable reduction in time and expense for transcription and ASR training. This project investigates unsupervised and supervised data selection techniques...

Publication date: 2004